The refinement panel in SharePoint 2010 is one of the
more complex Web Parts to use to its full potential. If configured
incorrectly, it can impose significant performance overhead on searches.
On the other hand, it provides many options that are not enabled in the
default out-of-the-box
configuration.
This section will focus on how
to easily access and enable some of the most commonly requested
features of the refinement panel. It will also discuss possible
implications and side effects when enabling these features.
1. Adding Refiner Counts
One of the most hyped
features of FAST for SharePoint 2010 is deep refinement and numbered
refiners that show accurate counts, as seen in Figure 1.
In FAST this is enabled in the default out-of-the-box refinement panel.
SharePoint 2010 does not support deep refinement, but it does support
numbered refiners. The functionality is simply not enabled by default, as
seen in Figure 2.
Numbered refiners, also
known as refiner counts, show the user how many search results refining
by each metadata property value will return. This allows the user to
quickly determine which refinements are most suitable to apply to get the
desired search results. For applications where data set analysis is
relevant, SharePoint 2010 is not suitable, as it does not produce
accurate numbers and does not even necessarily show all refinement
options due to limited precision.
Third-party solutions and
free solutions with numbered refiners have existed since MOSS 2007.
Examples are Ontolica from SurfRay A/S and the CodePlex refiners. Both
offer functionality similar to that of SharePoint 2010, but are more
scalable and more easily maintained. The Ontolica solution in particular
excels in refiner management.
An obvious question, then, is
why numbered refiners are not enabled by default in SharePoint 2010.
The likely reasons include their limited precision, which detracts from
the immediate customer experience, and a desire to lower the default load
on the index server. Marketing considerations likely also play a role:
hiding this feature makes FAST more attractive compared to SharePoint
2010 Search.
To add the counts on the
refiners for SharePoint 2010, the Filter Definition category must be
edited as mentioned earlier.
Refiner counts can be added
to individual properties. This allows site administrators to tweak
performance, resource requirements, and usability. Refiner counts make
more sense for some properties than others. Result Type, Site, and
Author are obvious candidates that can benefit from refiner counts,
whereas Modified Date is less obvious.
To add a refiner count for a property, the ShowCounts
attribute should be added as displayed here. Remember to uncheck the
Use Default Configuration check box for the changes to take effect as
seen in Figure 3.
<Category Title="Result Type"
Type="Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator"
ShowCounts="Count">
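When several categories need counts, editing the attribute by hand can be error-prone. The following Python sketch shows one possible way to add ShowCounts to selected categories before pasting the XML back into the Web Part. The sample XML is a trimmed, hypothetical fragment, and the FilterCategories root element and the helper name enable_counts are assumptions made for illustration; only the Category element with its Title, Type, and ShowCounts attributes is taken from the snippet above.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical Filter Category Definition. A real definition carries
# more attributes per category; the FilterCategories root is an assumption.
SAMPLE_XML = """
<FilterCategories>
  <Category Title="Result Type"
            Type="Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator" />
  <Category Title="Author" />
  <Category Title="Modified Date" />
</FilterCategories>
"""


def enable_counts(xml_text, titles):
    """Return the XML with ShowCounts="Count" set on the named categories."""
    root = ET.fromstring(xml_text.strip())
    for category in root.iter("Category"):
        if category.get("Title") in titles:
            category.set("ShowCounts", "Count")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Counts for Result Type and Author only; Modified Date is left untouched.
    print(enable_counts(SAMPLE_XML, {"Result Type", "Author"}))
```

Selecting categories by title keeps the choice per property, which matches the advice to enable counts only where they add value.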
2. Changing the Accuracy Index
Unlike FAST, which has
deep refiners and accurate refiner counts, SharePoint 2010 analyzes a
subset of the total result set when determining refinements to show.
This also applies to the refiner counts. The value that controls the
size of the analyzed subset is called the accuracy index. This value is
set on the refinement panel Web Part.
Having imprecise refiners is
not by definition a problem, as long as the users clearly understand how
they work. Often it is very valuable just to know whether a particular
refinement contains many or few results. In
situations like that, it does not matter if the refinement yields 300 or
30,000 results. The goal is already achieved when the user is able
to see that this refinement does not trim the result set
significantly.
By default, however, the
accuracy index that controls how many results are analyzed for the
SharePoint 2010 search refinement panel is set to 50, which is too low
for most corporations, as global searches must be expected to yield many
more results than that. It can, and in most instances should, be
increased to the maximum of 500. This is shown in Figures 4 and 5.
Any larger number is automatically changed to 500. This limit is
hard-coded in the Web Part.
Increasing the accuracy index
benefits the users, but calculating refinements and counts is a costly
operation. It is impossible to give rock-solid measurements
of how performance is affected when increasing the accuracy index, as it
depends on the query and on the composition and size of the total result
set. The slowdown is noticeable when the accuracy index is
increased from 50 to 500 for a query that yields a total result set that
exceeds 500. A good rule of thumb is to expect such a query to take two to
five times longer; the query can take twice as long to
perform when the accuracy index is increased fourfold. For many smaller
queries, the difference is close to non-existent.
NOTE
A good way of increasing the
usability of refiner counts is to increase the accuracy index to 500
and at the same time promote smaller result sets by introducing limited
scopes. This generally makes the result sets smaller, thus rendering the
refiner counts more accurate. It also increases the query performance.
If a higher accuracy is
required on SharePoint 2010, a custom-programmed solution or a third-party
solution such as Ontolica from SurfRay A/S is required. Ontolica allows
for a higher accuracy than the 500 limit. It is also able to estimate the
total refinement counts based on the size of the total result set, thus
producing much more reliable counts than SharePoint 2010, and with
higher performance too. Among other things, this is achieved by
extrapolating from the approximate total result count. To counter the
performance hit a user could experience when the counts are computed,
Ontolica can load the
refiner counts asynchronously. It should be noted, however, that although
custom and third-party solutions can in
theory analyze the entire result set and give deep refiners and
counts, this is not practical in most cases, as it would severely impact
performance on large result sets.
3. Number of Categories to Display
Refiners are defined in the
Filter Definition category XML file. They appear in the order that they
are defined. A common issue is that added refiners do not show up or
some existing refiner suddenly stops showing, depending on the order in
which they are specified in the XML file. This can happen if the number
of defined refiners exceeds the number of categories to display. By
default, at most six categories are displayed.
Another reason refiners are
not showing can be the metadata threshold (which is discussed next).
A newly added refiner also does not appear if the number of refiners
already visible has reached the number of categories to display. The
refiners appear to be rendered in the order in which they are listed in
the XML, so a refiner placed at the end is not seen by users when the
refinement panel has already rendered the maximum number.
The default Filter
Definition category XML file defines six refiners.
These are Result Type, Site, Author, Modified Date, Product Category,
and Company Size.
If six refiners
do not fit the custom-branded page layout, the number can be increased or
reduced by changing the number of categories to display. No limit on how
many refiners can be added has been found so far. Changing the number
to two displays only the first two refiners that qualify for display, as seen in Figure 6.
Although it should be
fairly self-explanatory how to add new refiners, one issue has been
raised a number of times: what if the order of the refiners in the XML
file should differ from the priority in which they are shown? Currently
this is not supported in SharePoint 2010. To achieve this, a third-party
solution such as Ontolica from SurfRay A/S, which supports such
advanced behavior, can be used. Alternatively, the CodePlex refiners can
be modified to achieve this, but that requires a significant amount of
reprogramming.
4. Metadata Threshold
One of the more difficult
settings to work with is the metadata threshold. A common
misunderstanding is that this value defines how many different values
must be available for a given property. This is not correct. It actually
defines how many times the property is found in any of the analyzed
search results for the refiner to display.
Assume that the metadata threshold is set to 5 for the Result Type refiner. This refiner operates on the FileExtension
property. If the analyzed search result set contains only two Word
documents and two PDF files, the property is found only four times and
this refiner will not show. This in turn
affects how the "Number of categories to display" setting behaves. If a total
of ten refiners are defined and the first four do not show, then
the panel shows the refiners from the fifth onward until the number of
categories to display is reached. If search results often cause most
refiners not to be shown due to the metadata threshold, consider
creating new refiners targeted at those situations so that
users can always make meaningful refinements. For this reason, it is not
advisable to set the metadata threshold too low. Instead new refiners
should be introduced.
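The interplay between definition order, the metadata threshold, and the number of categories to display can be sketched as a small Python model. It is an illustration of the behavior described above rather than SharePoint's actual logic: the Refiner class, the visible_refiners function, the Author threshold of 3, and the use of ">=" for the threshold comparison are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Refiner:
    title: str
    mapped_property: str
    metadata_threshold: int = 5  # byte-sized setting, 0 to 255


def visible_refiners(refiners, analyzed_results, categories_to_display=6):
    """Return the titles shown, walking the refiners in definition order."""
    shown = []
    for refiner in refiners:
        if len(shown) == categories_to_display:
            break  # panel is full; later refiners are never rendered
        # Number of analyzed results that carry a value for the property.
        hits = sum(1 for r in analyzed_results if r.get(refiner.mapped_property))
        if hits >= refiner.metadata_threshold:  # ">=" is an assumption
            shown.append(refiner.title)
    return shown


if __name__ == "__main__":
    # Two Word documents and two PDF files, each with an author.
    results = [
        {"FileExtension": "docx", "Author": "A"},
        {"FileExtension": "docx", "Author": "B"},
        {"FileExtension": "pdf", "Author": "A"},
        {"FileExtension": "pdf", "Author": "C"},
    ]
    refiners = [
        Refiner("Result Type", "FileExtension", metadata_threshold=5),
        Refiner("Author", "Author", metadata_threshold=3),
    ]
    print(visible_refiners(refiners, results))  # ['Author']
```

Result Type is skipped because FileExtension is found only four times, so the slot it would have taken goes to the next refiner in the list.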
The metadata threshold value is a byte, which means that its range is 0 to 255. In the following example, shown in Figures 7 and 8,
the number of categories to display is set to 2 and the threshold is
changed from 5 to 250 on the Result Type refiner to illustrate how it
works.
5. Number of Characters to Display
This is not really an advanced topic, but rather a hint about why to use a setting on the refinement panel that is generally left untouched.
The "Number of characters to
display" sets how many characters of each refinement option are
displayed. The default value is 16, which is often too short to show
meaningful information, especially for paths.
It is not uncommon to have names that are longer than 16 characters (Figure 9) or paths such as http://subdomain.domain.org/something/something/something/.
In this case, the limit of 16 is not enough to display the
actual file and site names on the path. Instead the path is displayed
as "http://subdomain...", because
SharePoint refiners automatically append "..." to indicate truncated
text. Setting this value too large is not a good idea either, as it
wastes screen space. In general, it should be set to a value
between 25 and 30, which fits nicely with the standard search layout.
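The truncation can be reproduced with a few lines of Python to preview how a given limit treats typical values. This is a sketch of the behavior described above; the function name refiner_label is hypothetical, and leaving values of exactly the limit length unchanged is an assumption.

```python
def refiner_label(value, max_chars=16):
    """Shorten a refinement value to max_chars and mark the cut with '...'."""
    if len(value) <= max_chars:
        return value
    return value[:max_chars] + "..."


if __name__ == "__main__":
    path = "http://subdomain.domain.org/something/something/something/"
    print(refiner_label(path))      # http://subdomain...
    print(refiner_label(path, 30))  # http://subdomain.domain.org/so...
```

Running representative site paths and author names through such a preview is a quick way to pick a limit before changing the Web Part setting.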